CUME_DIST in Databricks SQL

This page is a quick reference for CUME_DIST in Databricks SQL: behavior, syntax rules, edge cases, and a minimal example, plus a link to the official vendor documentation.


Function Details

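CUME_DIST returns the cumulative distribution of the current row within its partition: the number of rows at or before the current row in the window ordering (value <= current value for an ascending ORDER BY), divided by the total number of rows in the partition. Rows that tie on the ORDER BY value receive the same result, and the result is always greater than 0 and at most 1. The ORDER BY clause is required.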
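To make the definition concrete, here is a small sketch that runs CUME_DIST over an inline VALUES table and recomputes the same ratio by hand with COUNT. The four rows and the alias t are made up for illustration; the manual column spells out the window frame explicitly so that tied rows are counted together.

```sql
-- Hypothetical data: four salaries, two of them tied at 200.
SELECT
  employee_id,
  salary,
  CUME_DIST() OVER (ORDER BY salary) AS salary_cume_dist,
  -- Manual equivalent: (rows with salary <= current salary) / (total rows).
  -- The RANGE frame includes all peers of the current row.
  COUNT(*) OVER (ORDER BY salary RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)
    / COUNT(*) OVER () AS manual_cume_dist
FROM VALUES
  (1, 100),
  (2, 200),
  (3, 200),
  (4, 400) AS t(employee_id, salary)
ORDER BY salary, employee_id;
```

With these rows, both columns should come out as 0.25, 0.75, 0.75, and 1.0. The two tied salaries of 200 share the value 3/4 because three of the four rows have a salary of 200 or less.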

If this behavior feels unintuitive, the tutorial below explains the underlying pattern step-by-step.

CUME_DIST() OVER ( [PARTITION BY ...] ORDER BY ... )

SELECT
  CUME_DIST() OVER (ORDER BY salary) AS salary_cume_dist,
  employee_id,
  salary
FROM employees
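The optional PARTITION BY clause restarts the calculation for each group. A minimal sketch, assuming the employees table from the query above also has a department column (a hypothetical name used only for illustration):

```sql
-- Assumes a hypothetical `department` column on the employees table.
SELECT
  department,
  employee_id,
  salary,
  -- The ratio is computed separately within each department.
  CUME_DIST() OVER (PARTITION BY department ORDER BY salary) AS dept_salary_cume_dist
FROM employees
ORDER BY department, salary;
```

In each department, the last row in the salary ordering gets 1.0, because every row in that partition sits at or before it.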

What should you do next?

If you came here to confirm syntax, you’re done. If you came here to get better at window functions, choose your next step.

Understand the pattern

CUME_DIST is part of a bigger window-function pattern. If you want the “why”, start here: Percentile Distribution

Prove it with a real query

Reading docs is useful. Writing the query correctly under pressure is the skill.

Cumulative Spending by Species

Support Status

  • Supported: yes
  • Minimum Version: not specified. Databricks SQL warehouses run a continuously updated engine, so this page does not pin a minimum version.

Official Documentation

For the authoritative spec, use the vendor docs. This page is the fast “sanity check”.

View Databricks SQL Documentation →

Looking for more functions across all SQL dialects? Visit the full SQL Dialects & Window Functions Documentation.